Finding Groups in Large Data Sets

Author

  • Adrian Müller
Abstract

This paper gives an overview of methods for finding groups in large data sets, such as household expenditure survey data. The methods fall into three categories: cluster analysis, dimension reduction, and basic exploratory methods. The emphasis is on critical analysis and potential drawbacks, especially those arising from inputs that the researcher must provide. Such inputs may impose structure that is not present in the data, defeating the purpose of revealing intrinsic patterns. In general, the more elaborate methods, such as cluster analysis, are delicate to apply, especially in the social sciences. Often it may be best to limit oneself to more transparent approaches, such as comparisons of basic statistics.

Similar resources

Identification of Fraud in Banking Data and Financial Institutions Using Classification Algorithms

In recent years, due to the expansion of financial institutions, as well as the popularity of the World Wide Web and e-commerce, a significant increase in the volume of financial transactions has been observed. In addition to the increase in turnover, a sharp rise in fraud linked to abnormal user behavior is resulting in billions of dollars in losses worldwide. T...

Application of Benford’s Law in Analyzing Geotechnical Data

Benford’s law predicts the frequency of the first digit of numbers encountered in a wide range of naturally occurring phenomena. In data sets that follow Benford’s law, numbers begin with a small leading digit more often than with a large one. The law can be used as a tool for detecting fraud and abnormalities in number sets, including fabricated ones. This can be used as an ...

Solubility Prediction of Drugs in Supercritical Carbon Dioxide Using Artificial Neural Network

Descriptors computed by HyperChem® software were used to represent the solubility of 40 drug molecules in supercritical carbon dioxide, using an artificial neural network with a 15-4-1 architecture. The accuracy of the proposed method was evaluated by computing the average absolute error (AE) between calculated and experimental logarithms of solubility. The AE (±SD) of data sets was 0.4 ...

Publication date: 2002